Both severe acute respiratory syndrome coronavirus (SARS-CoV) and
Middle East respiratory syndrome coronavirus (MERS-CoV) are zoonotic
pathogens that crossed the species barrier to infect humans. The
mechanism of viral interspecies transmission is an important scientific
question that remains to be addressed. These coronaviruses contain a
surface-located spike (S) protein that initiates infection by mediating
receptor recognition and membrane fusion and is therefore a key factor
in host specificity. In addition, the S protein needs to be cleaved by
host proteases before executing fusion, making these proteases a second
determinant of coronavirus interspecies infection. Here, we summarize
the progress made in the past decade in understanding the cross-species
transmission of SARS-CoV and MERS-CoV by focusing on the features of
the S protein, its receptor-binding characteristics, and the cleavage
process involved in priming.


Coronavirus spike protein: a major viral determinant in 
interspecies transmission 

Coronaviruses (CoVs) are large, enveloped, positive-sense,
single-stranded RNA viruses that can infect both animals and humans
[1]. The viruses are further subdivided, based on genotypic and
serological characteristics, into four genera: Alpha-, Beta-, Gamma-,
and Deltacoronavirus [2,3]. Thus far, all identified CoVs that can
infect humans belong to the first two genera. These include the
alphacoronaviruses (alphaCoVs) HCoV-NL63 and HCoV-229E and the
betacoronaviruses (betaCoVs) HCoV-OC43, HKU1, SARS-CoV, and MERS-CoV
[1,4,5]. Special attention has been paid to betaCoVs, which have caused
two unexpected coronaviral epidemics in the past decade [6]. In
2002-2003, SARS-CoV first emerged in China and swiftly spread to other
parts of the world, leading to >8000 infections and ~800 deaths [6]. In
2012, a novel CoV, named MERS-CoV, was identified in the Middle East
[4,5]. The virus spread to multiple countries despite intensive
interventions, causing 1110 infections and 422 related deaths as of
29 April 2015
(http://www.who.int/csr/disease/coronavirus_infections/archive_updates/en/).
Both SARS-CoV and MERS-CoV are zoonotic pathogens originating from
animals. They are believed to have been transmitted from a natural
host, possibly bats, to humans through intermediate mammalian hosts
[7,8]. Thus, determining how these viruses evolved to cross species
barriers and infect humans is an active area of CoV research.
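
As a rough illustration of the case counts quoted above, the crude ratio of reported deaths to reported cases can be computed as follows. This is only a sketch: the function name is hypothetical, the SARS-CoV inputs are the approximate figures given in the text (>8000 cases, ~800 deaths), and a crude ratio of this kind ignores under-reporting and cases that were still unresolved at the reporting date.

```python
def crude_fatality_ratio(deaths: int, cases: int) -> float:
    """Return reported deaths divided by reported cases (hypothetical helper)."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return deaths / cases


# Figures quoted in the text; the SARS-CoV values are approximations.
sars_ratio = crude_fatality_ratio(deaths=800, cases=8000)
mers_ratio = crude_fatality_ratio(deaths=422, cases=1110)

print(f"SARS-CoV crude ratio: {sars_ratio:.1%}")
print(f"MERS-CoV crude ratio: {mers_ratio:.1%}")
```

On these reported numbers the ratio for MERS-CoV is several-fold higher than that for SARS-CoV, although the two outbreaks differ in size and in how cases were ascertained.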

The key determinant of the host specificity of a CoV is the
surface-located trimeric spike (S) glycoprotein, which can be further
divided into an N-terminal S1 subunit and a membrane-embedded
C-terminal S2 region [1]. S1 specializes in recognizing host-cell
receptors and is normally more variable in sequence among different
CoVs than is the S2 region [1,9]. Two discrete domains that can fold
independently are located in the S1 N- and C-terminal portions, both of
which can be used for receptor engagement [10]. The N-terminal domain
(NTD), functioning as the entity involved in receptor recognition, is
exemplified by murine hepatitis virus (MHV), which utilizes
carcinoembryonic antigen cell-adhesion molecules (CEACAMs) for cell
entry [11,12]. In most CoVs, however, the receptor-binding domain (RBD)
is found in the S1 C-terminus [10,13-17]. In such cases, the NTD might
facilitate the initial attachment of the virus to the cell surface by
recognizing specific sugar molecules [18-21]. The S1-receptor
interaction is therefore a key factor determining the tissue tropism
and host range of CoVs.

Following receptor binding via S1, the CoV S2 functions to mediate
fusion between the viral and the cellular membranes [1]. With
characteristics of type I fusion proteins, CoV S2 normally contains
multiple key components, including one or more fusion peptides and two
conserved heptad repeats (HRs), driving membrane penetration and
virus-cell fusion [1]. The fusion peptides are proposed to insert into,
and perturb, the targeted membranes [22,23]. The HRs can trimerize into
a coiled-coil structure and drag the virus envelope and the host cell
bilayer into close proximity, preparing for fusion to occur [24-28]. It
is notable that the CoV S protein is commonly cleaved by host proteases
to liberate S2 and the fusion peptides from the otherwise covalently
linked S1 subunit. This so-called priming process is highly dependent
on the spatiotemporal patterns of the host enzymes and is therefore
another key factor affecting cell tropism and the entry route of CoVs
[29].

In this review, we first summarize the features of the S protein, the
receptor-binding characteristics, the priming cleavage process, and the
interspecies transmission mechanisms of SARS-CoV. Previous research on
these topics has made SARS-CoV one of the best studied natural models
of a viral disease emerging from zoonotic sources. Special attention
will then be paid to MERS-CoV, focusing on the progress made in the
past several years on each of these items. We also review several
recent studies on bat coronaviruses (BatCoVs) that could point to a
zoonotic origin of MERS-CoV.

The SARS-CoV S glycoprotein, its cleavage priming and
interaction with ACE2, and viral interspecies
transmission

SARS-CoV S is a 1255-residue glycoprotein; it is suggested to be
cleaved into S1 and S2 subunits either between R667 and S668 by trypsin
or between T678 and M679 by endosomal cathepsin L [30,31], although the
functional relevance of T678 in virus-cell fusion remains to be fully
investigated. Several important modules in both S1 and S2 have been
systematically characterized thus far (Figure 1A,B). The SARS-CoV RBD
is found in the C-terminal portion of S1 and spans ~220 amino acids
(Figure 1A). It is composed of two subdomains: a core and an external
subdomain [13]. The core has a central β-sheet composed of five
antiparallel strands, which are surrounded by the polypeptide loops
connecting the strands and by several surface helices, together forming
a globular fold. The external region consists mainly of two small
β-strands and a large interstrand loop and is located distally to the
terminal side of the domain. A portion of the interstrand loop extends
over the surface of the core subdomain and, together with the two
β-strands, anchors the external region to the core like a clamp
(Figure 1B). Interestingly, one structure of the free SARS-CoV RBD
unexpectedly revealed possible dimerization of the protein through its
terminal side [32]. The biological relevance of this structural
observation, however, remains to be investigated. The authors suggest
that RBD dimerization might cross-link S trimers on the viral surface,
thereby affecting virus stability and infectivity. Despite systematic
structural studies of the SARS-CoV RBD, the structure of the SARS-CoV S
NTD is still unknown. It should be noted that this NTD, unlike its
counterparts in bovine coronavirus (BCoV) or HCoV-OC43 [20,21], cannot
recognize sugar moieties on mucin [12].

Figure 1. Severe acute respiratory syndrome coronavirus (SARS-CoV) spike features. (A) Schematic representation of the SARS-CoV spike protein (S). The individual components of S are marked with the boundary-residue numbers listed below; they were either experimentally characterized in previous studies, including receptor-binding domain (RBD), fusion peptide (FP), internal fusion peptide (IFP), heptad repeat 1/2 (HR1/2), and pretransmembrane domain (PTM) [13,27,35], or are based on bioinformatics analyses, for example, N-terminal domain (NTD). The S1/S2 cleavage sites and the S2’-recognition site are highlighted. Other abbreviations: SP, signal peptide; TM, transmembrane domain; and CP, cytoplasmic domain. (B) Atomic structures of SARS-CoV spike RBD, FP, IFP, HR1/HR2 complex, and PTM (from left to right). The crystal structures of RBD (core subdomain in green and external subdomain in magenta) and the six-helix bundle fusion core (consisting of three HR1/HR2 helical hairpins in green, cyan, and magenta, respectively) are shown as ribbons, while the solution NMR structures of FP, IFP, and PTM are contoured using the electrostatic surface. (C) The complex structure between SARS-CoV RBD and its receptor ACE2. The core and external subdomains of RBD and the N- and C-terminal lobes of ACE2 are colored green, magenta, cyan, and orange, respectively. (D) The amino acid interactions at the RBD-ACE2 interface. According to a previous study [13], this binding network involves at least 18 residues in the receptor and 14 residues in SARS-CoV RBD, which are listed and connected with solid lines. Black lines indicate van der Waals contacts, and red lines represent H-bond or salt-bridge interactions.

To enter host cells, SARS-CoV first needs to bind to the cell-surface
receptor angiotensin-converting enzyme 2 (ACE2) [33] via the viral RBD
[13]. ACE2 is a type I membrane glycoprotein and contains a large
N-terminal ectodomain built of two α-helical lobes [13,34]. The complex
structure of SARS-CoV RBD bound to ACE2 revealed that the viral RBD
uses its external subdomain to engage exclusively the N-terminal lobe
of the receptor (Figure 1C). Residues 424-494 (which are also referred
to as the receptor-binding motif or RBM because they make all of the
contacts with the receptor) in the RBD external region present an
elongated and gently concave outer surface, cradling the most
N-terminal helix in ACE2. In addition, the two ridges of this RBM
further interact with the receptor by contacting the α2/α3 interhelical
loops on one side and a β-hairpin and a small helix on the other [13].
The buried surface area upon complex formation is 927.8 Å² in the
SARS-CoV RBD and 884.7 Å² in ACE2. The interface involves at least 18
residues in the receptor and 14 residues in RBD, forming a network of
hydrophilic contacts that are suggested to predominate in the RBD/ACE2
interactions (Figure 1D) [13].

After binding to ACE2, fusion between the SARS-CoV envelope and the
host cell membrane is executed by the S2 subunit. Multiple
fusion-related components in SARS-CoV S2 have been extensively studied
thus far (Figure 1A,B). These include the fusion core composed of HR1
and HR2 [27,28] and at least three membranotropic regions that are
denoted as the fusion peptide (FP), internal fusion peptide (IFP), and
pretransmembrane domain (PTM), respectively [35]. The two HR modules
are located separately in S2, ~200 residues apart. They form a
coiled-coil structure built of three HR1-HR2 helical hairpins
(Figure 1B) [27,28], presenting as a canonical six-helix bundle, as
observed in other typical type I fusion proteins such as HIV gp41 [36]
and Ebola GP [37]. The HR regions are further flanked by the three
membranotropic components. Both FP and IFP are located upstream of HR1,
spanning residues 770-788 and 873-888, respectively, while PTM is
distally downstream of HR2 and directly precedes the transmembrane
domain of SARS-CoV S. All three components are able to partition into
the phospholipid bilayer and disturb membrane integrity [38], and their
structural features have recently been elucidated [35]. FP assumes an
α-helical conformation but shows significant distortion at its center.
In contrast, IFP exhibits a straight α-helical structure. PTM assumes a
helix-loop-helix fold. It should be noted that all three components can
create a hydrophobic side-surface (Figure 1B), explaining their
bilayer-binding capacities [35]. The exact role of these putative
fusion peptides in virus-cell fusion, however, remains to be fully
examined; for example, it is currently unknown whether FP, IFP, and PTM
function individually or in a synergistic manner. The evolutionary
conservation of these hydrophobic amino acid sequences in SARS-CoV S
highlights their potential participation in the viral entry process.

The priming of SARS-CoV S by host proteases is likely one of the best
characterized such processes among viral envelope proteins. Indeed, the
proteolytic activation mechanisms are summarized in several excellent
reviews [29,39,40]. Strikingly, this viral protein can be primed by a
diverse array of proteases. Due to the lack of a furin-recognizable
site, SARS-CoV S is largely uncleaved after biosynthesis [30]. It can
later be processed by endosomal cathepsin L during entry, enabling
SARS-CoV infection via the endocytosis pathway [41]. In addition, the
viral S can also be activated by extracellular enzymes such as trypsin,
thermolysin, and elastase, which are shown to induce syncytia formation
and virus entry, possibly at the plasma membrane [42]. Other proteases
of potential biological relevance in potentiating SARS-CoV S include
TMPRSS2, TMPRSS11a, and HAT [43-45], which are localized on the cell
surface and are highly expressed in the human airway [46]. It is also
noteworthy that TMPRSS2 can associate with ACE2 to form a
receptor-protease complex, enabling efficient virus entry directly at
the cell surface [47]. Echoing the important role of TMPRSS2 in
SARS-CoV infection, a recent study further indicated that serine
proteases (e.g., TMPRSS2) but not cysteine proteases (e.g., cathepsin
L) are required for SARS-CoV spread in vivo [48]. Furthermore, TMPRSS2
as well as other host enzymes, such as HAT and ADAM17, are also
implicated in the shedding of the human ACE2 receptor, which, in turn,
was shown to promote the uptake of virus particles [49,50]. Remarkably,
SARS-CoV S also contains an S2’ cleavage site downstream of the S1/S2
boundary [51-53]. This second cleavage event is believed to be crucial
for the final activation of S, and the sequence directly C-terminal to
S2’ displays characteristics of a viral fusion peptide and plays an
important role in mediating fusion [54]. It is still unknown how the
cleavage of S at S1/S2 or S2’, the insertion of the fusion peptides
into target membranes, and the assembly of HR regions are coordinated
as concerted events to complete membrane fusion (e.g., whether these
events follow specific spatiotemporal patterns). It should be noted
that SARS-CoV FP, which spans residues 770-788, would be separated from
the HR regions after proteolytic cleavage at S2’. This suggests a
scenario of membrane fusion with chronological steps in which FP
initially targets the host cell membrane to facilitate the subsequent
bilayer insertion of IFP, which remains linked to the HR regions after
S2’ proteolysis. Such a scenario also highlights the importance of
including multiple fusion peptides in SARS-CoV S for virus entry.

The interspecies transmission route of SARS-CoV is well established.
Mounting evidence shows that the natural hosts of the virus are bats
[55-57]. This notion was initially supported by the successful
identification of SARS-like coronaviruses (SL-CoVs) in bats.
Nevertheless, these viruses contain amino acid deletions in the S-RBM
region and are unable to interact with human ACE2 [55,56]. Recently, Ge
et al. successfully isolated an infectious SL-CoV in Chinese horseshoe
bats that shows far more sequence conservation in S to SARS-CoV than
previously identified SL-CoVs do [56] and can recognize both bat and
human ACE2 as the receptor [57], providing solid evidence for the bat
origin of SARS-CoV. Palm civets and raccoon dogs were identified as the
replication hosts for SARS-CoV [58], although it is still a matter of
debate whether the virus is transmitted from bats to humans directly or
via these intermediate animals. The ACE2 receptors of civets and
raccoon dogs, however, can faithfully be recognized by SARS-CoV S
[59-61]. Mouse ACE2 can also be utilized by SARS-CoV but with much less
efficiency than the human receptor [62]. This is because the mouse
receptor contains a Lys-to-His mutation at position 353 and is
therefore devoid of a key hydrophilic interaction rendered by the
lysine residue [13]. Rat ACE2 also harbors this K353H mutation. In
addition, it has an extra glycosylation site at position 82. The linked
carbohydrate moieties are proposed to sterically occlude binding of
SARS-CoV RBD to the rat receptor [13]. In support of this, deletion of
the glycan, together with the H353K substitution, restores RBD-binding
to the rat receptor [63,64]. In light of the inefficiency of SARS-CoV
RBD in recognizing the mouse and rat receptors, it is unlikely that
these two species are involved in the SARS-CoV zoonosis.

It is noteworthy that, of the 18 ACE2 residues interfacing with
SARS-CoV RBD, multiple (>7) amino acid substitutions are observed in
the civet and raccoon dog receptors, in contrast to the receptors in
other infection-permissive species [such as monkey (African green
monkey), macaque, marmoset, hamster, and cat] (reviewed in [65]) that
contain <4 mutations in the region (Table 1). Furthermore, ferret ACE2
(with nine substitutions relative to the human homologue) differs from
human ACE2 at half of the interface residues (Table 1) but can still be
recognized by SARS-CoV S [66]. These observations indicate plastic
RBD/ACE2 interactions which can ‘tolerate’ relatively large variations
in the receptor. The inability of ACE2 of a certain species to function
as the SARS-CoV receptor, therefore, likely arises from combinations of
certain mutations. For example, the mutation incorporating a potential
N-glycosylation site at N82 in conjunction with the K353H substitution
in rat ACE2, but not a single M82N mutation as observed in hamster
ACE2, abrogates the receptor’s binding capacity for SARS-CoV S. It is
also notable that ACE2s of different bat species differ in their
ability to serve as the receptor for SARS-CoV [59]. ACE2 of the Chinese
rufous horseshoe bat Rhinolophus sinicus, but not that of Pearson’s
horseshoe bat Rhinolophus pearsonii, supports S-mediated SARS-CoV
infection [59], although the receptor proteins of the two species both
contain seven mutations in the RBD-interfacing region (Table 1). The
structural basis underlying this observed difference remains to be
elucidated.

The S adaptation for binding to the human receptor is also well
documented for SARS-CoV. Comparison of the RBD sequences of SARS-CoV
isolated from humans and civets revealed six residue substitutions
[67], among which three (at positions 472, 479, and 487, respectively)
belong to the list of 14 interfacing residues (Figure 1D). K479N and
S487T mutations have been reported in several studies [64,68,69] as the
key changes in adapting SARS-CoV RBD for the human receptor. S protein
with the civet-specific K479 and S487 residues can efficiently
recognize civet ACE2 but interacts with human ACE2 much less
efficiently [64]. Substitution of these two amino acids with the
human-specific N479 and T487, either individually or in combination,
dramatically increases the affinity of S for the human receptor
[64,68]. This increased binding affinity is believed to be related to
the elimination of unfavorable free charges at the interface upon
mutation [70] and the extra contacts established by the methyl group of
T487 [71]. Residue changes at other positions in the RBM might also be
related to the SARS-CoV adaptation. For instance, a virus bearing the
civet S with the K479N mutation was passaged on human airway epithelial
cells. Adaptive substitution occurred at residues 442 and 472, rather
than at the 487 site identified in the epidemic strains [69]. The
changes in SARS-CoV S required for interspecies transmission are also
exemplified in two independent studies on mouse-adapted viruses. Two
groups identified the same S-substitution at position 486, which is
believed to be directly linked to the enhanced infectivity and
pathogenesis in the murine host [72,73].

Table 1. Comparison among different species of the ACE2 residues interfacing with severe acute respiratory syndrome coronavirus (SARS-CoV) receptor-binding domain (RBD)*

*The 18 residues in human ACE2 that are identified to interface with SARS-CoV RBD are listed and compared for conservation in different species. The letters in red highlight the amino acid mutations at the corresponding positions, which are based on human ACE2 numbering. The ACE2 receptors that can be recognized by the SARS-CoV S protein include those from human, monkey (African green monkey), macaque, marmoset, hamster, cat, civet, raccoon dog, ferret, mouse, and bat (Rhinolophus sinicus, R. sinicus), although the mouse and bat (R. sinicus) ACE2s are utilized inefficiently. The rat and bat (Rhinolophus pearsonii, R. pearsonii) receptors, however, are unable to be used by SARS-CoV. Accession numbers: human (AY623811), monkey (AY996037), macaque (NM_001135696), marmoset (XM_008988993), hamster (XM_005074209), cat (NM_001039456), civet (AY881174), raccoon dog (AB211998), ferret (AB208708), mouse (EF408740), bat (R. sinicus) (GQ999936), rat (AY881244), bat (R. pearsonii) (EF569964).


MERS-CoV S, its cleavage priming and interaction with 
CD26, and viral interspecies transmission 

MERS-CoV S is composed of 1353 residues and displays a remarkably
similar domain arrangement to its SARS-CoV homologue (Figure 2A),
although the overall sequence identity between the two viral proteins
is rather limited. However, unlike SARS-CoV S, the MERS-CoV S protein
can be readily processed into S1 and S2 subunits upon expression
[74-76]. In S1, the receptor-recognizing RBD is localized to the
C-terminal portion, spanning ~240 residues [16,17,77]. These amino
acids fold into a structure consisting of two subdomains, as reported
for the SARS-CoV equivalent. The core subdomain presents remarkable
similarities to that of the SARS-CoV RBD, but the external subdomain is
structurally distinct from the SARS-CoV RBD external region and
comprises mainly four antiparallel β-strands (Figure 2B). In S2, the HR
regions are also well studied [26,78]. As expected, the HR1 and HR2 of
MERS-CoV also form an intra-hairpin helical structure that can assemble
as a trimer into a six-helix bundle (Figure 2B), demonstrating a
canonical membrane-fusion mechanism as reported for other type I fusion
proteins [24]. These studies provide insight into the characteristics
of MERS-CoV S. Nevertheless, other S-components of this novel CoV
remain largely uninvestigated. For example, it is still unknown whether
the RBD-preceding NTD of MERS-CoV S1 might similarly fold into a
galectin-like structure (as in MHV [12]) and function to facilitate the
initial viral attachment to the cell surface by recognizing certain
sugar molecules (as in BCoV and HCoV-OC43 [20,21]). In addition, the S2
fusion peptides of MERS-CoV must also be experimentally investigated,
although concentrations of hydrophobic residues similar to those in the
SARS-CoV FP, IFP, and PTM can be individually identified in the
equivalent regions of MERS-CoV S (Figure 2B).

MERS-CoV initiates human infection by first specifically interacting
with its receptor CD26 (also known as dipeptidyl peptidase 4 or DPP4)
[79]. CD26 is a membrane-bound peptidase with a type II topology and
can form homodimers on the cell surface [80-82]. Its ectodomain
structurally comprises two domains, an α/β-hydrolase domain and an
eight-bladed β-propeller [81,82]. The MERS-CoV RBD specifically
recognizes, via its external subdomain, the β-propeller of the receptor
for engagement (Figure 2C) [16,17]. The four external β-strands of the
RBD create a relatively flat surface to interact with the propeller
blades IV and V. Large surface areas of 1203.4 Å² in CD26 and 1113.4 Å²
in MERS-CoV RBD are buried to form an extended binding interface [16],
in which 13 residues of the receptor and 18 amino acids of the RBD play
important roles in the binding by providing either H-bond/salt-bridge
interactions or multiple van der Waals contacts (Figure 2D). Among
these, a strong network of hydrophilic contacts is created mainly with
the interface-residue side-chains. In addition, a small hydrophobic
depression in RBD further cradles the bulged inter-blade helix in the
receptor, which presents several apolar side-chains (Figure 2C).
Finally, the RBD and CD26 binding also involves a receptor-linked
carbohydrate entity interacting with several solvent-exposed residues
in the RBD (Figure 2D), drawing parallels between MERS-CoV and the
alphaCoV porcine respiratory coronavirus. The latter also recognizes a
sugar component in the receptor [15]. An unexpected feature of MERS-CoV
binding to CD26 is its competitive interference with the interaction
between CD26 and adenosine deaminase (ADA), which has been suggested to
deliver an important costimulatory signal in immune activation [80]. A
majority of the CD26 residues interfacing with MERS-CoV RBD are also
shown to engage ADA [16,17,83].

Figure 2. Middle East respiratory syndrome coronavirus (MERS-CoV) spike features. (A) Schematic representation of the MERS-CoV spike protein. The boundaries for the individual components, as well as the S1/S2 and S2’ cleavage sites, are marked. Abbreviations: SP, signal peptide; NTD, N-terminal domain; RBD, receptor-binding domain; FP, fusion peptide; IFP, internal fusion peptide; HR1/2, heptad repeat 1/2; PTM, pre-transmembrane domain; TM, transmembrane domain; and CP, cytoplasmic domain. Question marks highlight the fusion peptides (FP, IFP, and PTM) of MERS-CoV that still await structural and functional characterization. (B) Crystal structures of the MERS-CoV spike RBD and HR1/HR2 fusion core. Left panel: the RBD structure with its core subdomain highlighted in green and external subdomain in magenta. Middle-left panel: a structural superimposition between MERS-CoV RBD (core and external subdomains in green and magenta, respectively) and severe acute respiratory syndrome coronavirus (SARS-CoV) RBD (in gray). Middle-right panel: the fusion core structure with the three HR1/HR2 chains in green, cyan, and magenta, respectively. Right panel: sequence comparison between SARS-CoV and MERS-CoV highlighting the spike regions of SARS-CoV FP, IFP, and PTM, respectively. Important hydrophobic residues are marked in boxes. (C) The complex structure between MERS-CoV RBD and the receptor CD26/DPP4. MERS-CoV RBD is colored as in panel (B), and the receptor is highlighted in cyan for the β-propeller domain and in orange for the α/β-hydrolase domain, respectively. The inter-blade helix referred to in the text is marked. (D) Atomic binding-network between MERS-CoV RBD and CD26 [16]. The RBD-CD26 interface includes 13 amino acids from the receptor and 18 residues from the virus RBD, which are individually connected with either black lines, for van der Waals contacts, or red lines, for H-bond or salt-bridge interactions. The CD26 residue N229 contributes to the RBD-binding via its linked sugar moieties rather than directly engaging RBD, and is therefore highlighted in yellow.

The host proteases involved in the priming of MERS-CoV S have also been
broadly studied. A pioneering study demonstrated that MERS-CoV S,
unlike its SARS-CoV counterpart, can be efficiently cleaved after
biosynthesis in HEK-293T cells [74]. It was recently demonstrated that
furin cleaves S at R751/S752, separating it into S1 and S2 subunits
[76]. In addition, a second furin cleavage site (S2’) was identified in
S2, between R887 and S888, upstream of the putative fusion peptide that
likely corresponds to SARS-CoV IFP (Figure 2A) [76]. With mounting
evidence showing that processing at S2’ is an essential determinant of
the intracellular site of fusion [84], a two-step activation mechanism
for MERS-CoV entry [76] has been proposed in which the first cleavage
occurs between S1 and S2 during the secretion of S protein in the
endoplasmic reticulum (ER)-Golgi compartments, where furin is
localized, and the second at S2’ during virus entry into target cells.
The other reported proteases involved in MERS-CoV S-activation include
TMPRSS2 [74,85], TMPRSS4 [86], and endosomal cathepsin B and/or L
[74,85]. It is noteworthy that MERS-CoV, similar to SARS-CoV, might use
different activation pathways for cell entry depending on the
spatiotemporal patterns of the host priming enzymes [87]. For example,
the presence of TMPRSS2 or trypsin treatment can bypass the endosomal
entry pathway to initiate membrane fusion at the cell surface [85,87].
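
The cleavage positions quoted in the text translate directly into residue ranges for the resulting fragments. The sketch below makes that bookkeeping explicit; the function name is hypothetical, the numbering is assumed to refer to the full-length precursors (1353 residues for MERS-CoV S, 1255 for SARS-CoV S), signal-peptide removal is ignored, and for SARS-CoV only the R667/S668 site is used.

```python
def cleavage_fragments(length: int, cut_after: list[int]) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) residue ranges obtained by
    cutting a chain of `length` residues after each position in `cut_after`.
    Hypothetical helper for illustration only."""
    bounds = sorted(set(cut_after))
    if any(pos < 1 or pos >= length for pos in bounds):
        raise ValueError("each cut must fall inside the chain")
    starts = [1] + [pos + 1 for pos in bounds]
    ends = bounds + [length]
    return list(zip(starts, ends))


# MERS-CoV S: S1/S2 cut at R751/S752, S2' cut at R887/S888 (from the text).
mers_fragments = cleavage_fragments(1353, [751, 887])

# SARS-CoV S: S1/S2 cut at R667/S668 only (from the text).
sars_fragments = cleavage_fragments(1255, [667])

print(mers_fragments)  # [(1, 751), (752, 887), (888, 1353)]
print(sars_fragments)  # [(1, 667), (668, 1255)]
```

Under these assumptions the MERS-CoV S2’ cut leaves a 136-residue segment (752-887) between the two cleavage sites, with the fragment starting at S888 carrying the putative fusion peptide.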

The cross-species transmission route of MERS-CoV remains poorly
understood. Nevertheless, mounting evidence indicates that the virus is
a zoonotic pathogen which likely originated in bats and was then
transmitted to other animals, such as dromedary camels. Despite several
studies documenting the interhuman transmission of MERS-CoV [88,89], a
large portion of the cases of infection cannot be directly linked to
contacts with index patients. The genome diversity of human MERS-CoV
isolates is highly suggestive of human infections arising from several
independent zoonotic events from animal reservoirs [90,91]. The
dromedary camel has thus far been well documented as an intermediate
host. Both MERS-CoV-specific antibodies and RNAs can be detected in
dromedary sera and milk [92-94], and live viruses were recently
isolated from infected camels [95]. Additional direct evidence of
dromedary-to-human transmission comes from the isolation of MERS-CoVs
with almost identical genomic sequences from patients and from their
breeding dromedaries [96,97]. Viral gene fragments identical or quite
similar to those of MERS-CoV have also been recovered in bats [98-100],
again raising the possibility that the bat acts as the natural
reservoir of MERS-CoV. An evolutionary analysis of bat CD26 genes
indicates a long-term arms race between bats and MERS-related CoVs,
suggesting that MERS-CoV ancestors circulated in bats for a substantial
period of time [101]. It is also interesting to note that a recent
study indicates that MERS-CoV may have jumped from bats to camels up to
20 years ago in Africa, with the camels then being imported into the
Arabian peninsula [102].

Multiple cells (primary or cell lines) derived from differ- 
ent species have been investigated for susceptibility to 
MERS-CoV infection. The results show that cells of rhesus 
macaque, marmoset, goat, horse, rabbit, pig, civet, camel, 
and bat — but not of mouse, hamster, and ferret — are 
permissive to MERS-CoV replication [87,103—110]. By fo- 
cusing on the list of the 13 residues that were identified as 
key interface amino acids in the receptor, it is noteworthy 
that the receptor in species of the permissive group is 
either identical to the human receptor or varies from it 
by only one or two residues, whereas the receptor of species 
in the resistant group is more variant, showing multiple 
(>5) substitutions (Table 2). The inability of MERS-CoV to 
infect mouse, hamster, and ferret should therefore be 
attributed to the inability of the virus to recognize the 
CD26s of these species, which contain too many mutations 
in the RBD-binding region. In support of this, expression in
otherwise nonpermissive baby hamster kidney (BHK) cells of a
hamster CD26 whose variant residues are substituted with the
equivalent human amino acids renders the cells susceptible to
MERS-CoV infection [109]. These results demonstrate
that the binding capacity by MERS-CoV RBD is a key 
factor determining the host susceptibility to MERS-CoV 
infection. It has yet to be determined whether dog and cat,
which clearly belong to the second group, are resistant to
the virus. It would be of further interest to investigate the
13-residue list for the minimal amino acid combinations
that are required for interaction with the MERS-CoV
RBD.
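
The grouping rule described above (0-2 substitutions in the 13-residue list for permissive species, more than 5 for resistant species) can be sketched in a few lines of Python. This is only an illustration: the function names and thresholds-as-parameters are our own, and the residue strings are hypothetical placeholders, not the actual CD26 residues of any species. Counts of 3-5 substitutions are not covered by the data discussed here, so the sketch reports them as undetermined.

```python
def count_substitutions(reference: str, candidate: str) -> int:
    """Count positions at which candidate differs from reference."""
    if len(reference) != len(candidate):
        raise ValueError("interface residue lists must have equal length")
    return sum(1 for r, c in zip(reference, candidate) if r != c)


def classify_receptor(n_subs: int, permissive_max: int = 2,
                      resistant_min: int = 6) -> str:
    """Apply the 0-2 / >5 grouping; anything in between is left open."""
    if n_subs <= permissive_max:
        return "permissive-like"
    if n_subs >= resistant_min:
        return "resistant-like"
    return "undetermined"


# Hypothetical placeholder strings (NOT real CD26 residues); each letter
# stands for one of the 13 interface positions.
HUMAN_REF = "ABCDEFGHIJKLM"
EXAMPLES = {
    "species_x": "ABCDEFGHIJKLM",  # identical to the reference
    "species_y": "ABXDEFGHIJKLX",  # two positions changed
    "species_z": "XXXDEFXXXJKLM",  # six positions changed
}

for name, residues in EXAMPLES.items():
    n = count_substitutions(HUMAN_REF, residues)
    print(name, n, classify_receptor(n))
```

Such a count reflects receptor recognition only; it does not capture other determinants such as the availability of S-priming proteases in the host cells.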

It should also be noted that sheep and bovine CD26s
contain the same two residue variations as goat and are
shown to mediate MERS-CoV infection of BHK cells upon
expression [109]. Nevertheless, another study demonstrated
that cells derived from sheep and cattle are resistant to
MERS-CoV [106], and accordingly, no MERS-CoV-specific
antibodies were detected in the sera of 80 tested cattle and
40 sheep in an epidemiologic survey [93]. The discrepancy
in these results might reflect the difference in the
priming-protease system between sheep/cattle cells and BHK cells.
Although MERS-CoV can recognize sheep/cattle CD26, the
lack of appropriate proteases for S-activation would prevent
membrane fusion and the subsequent virus
entry. The hamster-derived BHK cells, on the other hand,
are able to prime MERS-CoV S and therefore become
infection-permissive after gaining the capacity to interact
with MERS-CoV RBD. A similar scenario is also observed
in mice, which can be effectively infected by MERS-CoV
after ectopic expression of human CD26 in the animal
[111]. Characterization in different species of the
spatiotemporal patterns of the enzymes that prime MERS-CoV S
represents an interesting and as-yet-unresolved issue.

Table 2. Comparison among different species of the CD26 residues interfacing with Middle East respiratory syndrome coronavirus (MERS-CoV) receptor-binding domain (RBD)*

Position: 267, 288, 317
Species: Human, Macaque, Marmoset, Cattle, Horse, Goat, Pig, Camel, Sheep, Rabbit, Bat (Pipistrellus)

*The 13 residues in human CD26 that are identified to be key interfacing amino acids for MERS-CoV RBD binding were listed and compared for their conservation in different
species. The letters in red highlight the amino acid mutations at the corresponding positions, which are based on human CD26 numbering. Two groups can be identified: the
former (permissive), including human, macaque, marmoset, cattle, horse, goat, pig, camel, sheep, rabbit and bat, has accumulated small numbers (0-2) of mutations in the
13-residue list; whereas the latter (resistant), with cat, dog, ferret, hamster, rat and mouse, contains multiple (>5) substitutions in the region. Accession numbers: human
(NP_001926), macaque (NP_001034279), marmoset (XM_002749392), cattle (NM_174039), horse (XP_001494049), goat (KF574265), pig (NM_214257), camel (AHK13386),
sheep (XP_004004709), rabbit (XP_002712206), Bat (Pipistrellus) (AGF80256), cat (NP_001009838), dog (XP_535933), ferret (KF574264), hamster (XP_007608372), rat
(NP_036921), and mouse (NP_034204).

The changes in S related to MERS-CoV interspecies 
adaptation are thus far unknown. Several genetic analy- 
ses were recently conducted to characterize the evolution- 
ary status of the virus since its identification in 2012. The 
results show that the MERS-CoV RBD has largely 
remained unchanged in sequence in the circulating virus- 
es. In a study focusing on the human MERS-CoV strains, 
the authors demonstrate that only one codon of spike 
residue 1020 (located in S2) is under strong positive 
selection, despite the fact that the overall evolutionary 
rate of the virus is estimated to be 1.12 × 10⁻³ substitutions
per site per year [112]. Several substitutions have
also been detected in the S-RBM region of some MERS- 
CoV strains, including those at positions 482, 506, 509, 
and 534. Among these, only L506 plays an important role 
in CD26 binding (Figure 2D). The identified L506F muta- 
tion, however, reduces the receptor-binding capacity and 
thereby impairs viral fitness [113]. It should be noted that 
artificial selection of escape mutants with MERS-CoV
RBD-specific antibodies can lead to the same L506F substitution
[113], raising the possibility that the naturally
occurring residue change at this position is the conse- 
quence of host immune pressure rather than a result of 
evolution for a better affinity to CD26. Accordingly, none of 
the identified S-changes are observed in multiple genomes 
[112]. A second study analyzed the MERS-CoV sequences 
of the dromedary isolates and identified only the A520S 
substitution in the RBD [114]. Although this residue is 
located in the external subdomain, it does not directly 
contact the receptor. Therefore, it remains to be investi- 
gated whether any residue substitutions in the RBD occur 
naturally and can facilitate cross-species transmission of 
MERS-CoV by increasing the S affinity for human CD26. 
The current data indicate that the combination of the 
18 RBD amino acids listed in Figure 2D remains dominant 
in the circulating strains, both in humans and dromedaries.
This seems to favor the notion that the present MERS-CoV
RBM sequence represents one of the best CD26-interacting
candidates. The residues that determine the preference of
MERS-CoV S for the CD26 of a particular species still await
identification.


BatCoV HKU4 S protein interaction with CD26 and its 
implication for the bat origin of MERS-CoV 

A large number of coronaviruses have been recorded as
having origins in bats (at least for their genomes) [115]. However,
their public health relevance and/or evolutionary relatedness
to the known human-infecting coronaviruses
remain to be examined. BatCoVs HKU4 and HKU5 have
recently drawn increasing attention due to their close
phylogenetic relationship to MERS-CoV [116]. These CoVs were
first identified as genomic sequences in 2005 in lesser
bamboo bats and Japanese pipistrelles, respectively
[117]. Though isolation of the infectious viruses has thus
far been unsuccessful, mounting evidence indicates that
these two viruses are still circulating in bats [118]. Recently,
Yang et al. [119] and our group [75] concomitantly showed
that BatCoV HKU4, but not HKU5, can recognize human
CD26 as a functional receptor for cell entry. HKU4 S is
composed of 1352 residues (Figure 3A) and can readily
interact with human CD26 [75]. However, it does not contain a
clear furin-recognition site [29] and is expressed as an intact
protein in 293T cells, remaining uncleaved upon incorporation
into the pseudoviral envelope. Accordingly, the BatCoV
HKU4 pseudovirus was unable to infect cells
expressing human CD26 [75]. Potential trypsin-cleavage
sequences can nevertheless be identified in two regions homologous to
the S1/S2 and S2’ sites of other CoVs [29], and trypsin
treatment indeed efficiently primes HKU4 S and leads to
efficient pseudoviral transduction [75]. These observations
reveal that the inability of HKU4 S to drive
entry into human cells (and thus, potentially, to be transmitted
to humans) is due to lack of priming and not to lack of
receptor engagement, highlighting once again the indispensability
of S cleavage in coronavirus infection. Although HKU4 S lacks
recognizable sites for furin, it remains to be investigated
whether it might be activated by any other
commonly observed priming proteases, such as TMPRSSs
and cathepsins. Special attention should be paid to virus
variants that are more susceptible to cleavage by
host enzymes other than trypsin.

Figure 3. Bat coronavirus (BatCoV) HKU4 spike features. (A) Schematic representation of the HKU4 spike protein. The listed component boundaries are mostly defined
according to the bioinformatics analyses, except for the RBD which has been experimentally characterized [75]. The cleavage sites for S1/S2 and S2’ were predicted based on the
homology sequence comparison with other coronaviruses and are therefore labeled with question marks. Abbreviations: SP, signal peptide; NTD, N-terminal domain; RBD,
receptor-binding domain; HR1/2, heptad repeat 1/2; TM, transmembrane domain; and CP, cytoplasmic domain. (B) Crystal structure of HKU4 RBD. The external and core
subdomains are colored magenta and green, respectively. (C) Complex structure between HKU4 RBD and human CD26. The coloring scheme is: RBD core, green; RBD external,
magenta; receptor β-propeller domain, cyan; and receptor α/β-hydrolase domain, orange. (D) The HKU4 RBD is suboptimal for CD26 interaction compared to Middle East
respiratory syndrome coronavirus (MERS-CoV) RBD [75]. The 18 CD26-interfacing residues in MERS-CoV RBD, as listed in Figure 2D, were individually compared with the
equivalent amino acids in HKU4 RBD. The numbers highlight the van der Waals contacts each residue can provide for interacting with CD26. ‘>’ indicates that the MERS-CoV
residues are better adapted for CD26-binding, and conversely, ‘<’ implies that the HKU4 amino acids are better adapted. The residue differences are highlighted with red arrows.

The RBD of BatCoV HKU4, which spans residues 372-611
(Figure 3A), has also been structurally characterized 
[75]. It displays a fold that resembles the MERS-CoV RBD 
(Figure 3B) and utilizes a conserved receptor binding mode 
for interaction with CD26 (Figure 3C). Interestingly, of the 
18 identified CD26-interfacing residues in MERS-CoV RBD, 
11 amino acids are mutated and 15 are suboptimal for 
receptor interaction in HKU4 RBD (Figure 3D) [75]. None- 
theless, a pseudoviral infection assay demonstrates that 
HKU4 S is able to mediate virus entry, although less effi- 
ciently than MERS-CoV S. These results indicate that 
dramatic changes at this 18-residue interface do not neces- 
sarily abrogate the interaction between viral S and CD26, 
which in turn leaves room for MERS-CoV and the 
related viruses (e.g., BatCoV HKU4) to evolve to escape from 
the neutralizing antibodies targeting the RBM and to facili- 
tate interspecies transmission. It is also notable that Bat- 
CoV HKU4 exhibits better binding capacity for bat CD26 
than for human CD26 [119], whereas the converse preference
has been reported for MERS-CoV [119]. This implies a 
common ancestor in bats for MERS-CoV and BatCoV 
HKU4, which divergently evolved for better interaction with 
the human and bat receptors, respectively. These studies 
also indicate the need for surveillance of HKU4-related 
viruses for their cross-species potential in the future. 

It is notable that SARS-CoV seems to ‘tolerate’ large 
variations in the receptor (as illustrated in ferret ACE2 
with half of the interfacing residues being substituted). 
Small variations in the viral RBD (with N479K and
T487S), however, can lead to altered receptor-binding
specificity, dramatically decreasing its affinity for human ACE2. 
In contrast, MERS-CoV likely only recognizes conserved 
CD26 sequences with a maximum of two mutations in the 
RBD-binding region. Nevertheless, the capacity of receptor 
engagement can still be preserved despite dramatic changes 
in the viral ligand (as demonstrated in HKU4 RBD). These 
differences could indicate different evolutionary and inter- 
species transmission routes between SARS-CoV and MERS- 
CoV, which would be an interesting issue awaiting answers. 

Concluding remarks 

The emergence of two betaCoV-related epidemics in the 
past decade revitalized CoV research, focusing on the 
interspecies transmission mechanisms of these viruses. 
The CoV S protein is a key factor in determining viral 
tissue tropism and host range. Much progress has been 
made thus far regarding the features of S, the interaction of 
S with receptors, and the priming of S by host proteases. 
Although SARS-CoV represents one of the best studied 
models for which the cross-species transmission route has 
been well established, many questions related to MERS- 
CoV interspecies transmission remain unanswered (Box 
1). These include, but are not limited to, the structure and 
function of the S NTD, the composition of the fusion pep- 
tides, the key determinants in S for CD26 interaction, and 
the virus/host interplay determining the entry route of the 
virus. Such questions should be systematically addressed 
in the future. It is also noteworthy that all current views on 
CoV S are built on the discrete functional domains. An 
intact S structure is not available for any CoV, although 
the low-resolution electron-microscopy structure of SARS-CoV
S has been reported [120,121]. Obtaining a high-resolution
structure of intact S is an issue deserving even higher
priority (Box 1). In summary,
this review focused on our understanding of the coronaviral
S proteins to illustrate the basis of interspecies transmission
of SARS-CoV, MERS-CoV, and beyond, knowledge that
should help to predict or prevent
further transmission events. 


Box 1. Outstanding questions 


• The fusion peptides of MERS-CoV S still await structural and
  functional characterization. Could any of these fusion peptides
  be targeted by small molecules to inhibit virus infection?
• What will be revealed by systematic and comparative studies on
  the spatiotemporal characteristics of the enzymes potentially
  involved in MERS-CoV S-priming among different species?
• In the list of the 13 CD26 residues that interface with the MERS-CoV
  RBD, what residue combination(s) constitute the key component
  that is indispensable in RBD-binding? The answers to this and the
  second point would enable us to predict the infection and
  transmission capacity of MERS-CoV in a specific species.
• Is the dromedary camel the only intermediate host of MERS-CoV, or
  are other animals also involved in the interspecies transmission of
  the virus from its natural host, possibly bat, to humans? Special
  attention should be paid to the livestock animals in the first group
  (Table 2) whose CD26 receptors are able to be recognized by
  MERS-CoV, although no evidence of these animals being infected by
  MERS-CoV has come to light thus far. In addition, pets such as cats
  and dogs in the second group (Table 2) are in close contact with
  humans and should be investigated to ensure that they do not carry
  MERS-CoV.
• What S-substitutions are involved in the interspecies adaptation of
  MERS-CoV? A large-scale genomic characterization of the MERS-CoV
  isolates from human and dromedaries, and of the MERS-CoV-related
  viruses from bats, should be conducted, focusing on the
  residue changes in the receptor-binding region, to determine
  whether there are any naturally occurring mutations that enhance
  or decrease its binding capacity for human or camel CD26. It is of
  equal importance to identify, via artificial substitutions, the key
  residues determining the preference of MERS-CoV S for the CD26
  of a certain species.
• What is the role of the SARS-CoV and MERS-CoV NTD in virus
  infection? Do they share structural features with galectin, as
  reported in betaCoVs such as HCoV-OC43 and BCoV?
• What do we expect to observe at the atomic level in an intact S
  trimer? An intact S structure has not been solved for any CoV.